Two vocoder techniques for neutral to emotional timbre conversion
Abstract
In this paper, we describe the application of two vocoder techniques to a spectral envelope transformation experiment. We processed speech recorded in a neutral, standard reading style in order to reproduce the spectral shapes of two emotional speaking styles: happy and sad. This was achieved by means of conversion functions that operate in the frequency domain and are trained on aligned source-target pairs of spectral features. The first vocoder is based on the source-filter model of speech production and uses the Mel Log Spectral Approximation filter, while the second is the phase vocoder. Objective distance measures were calculated to evaluate how effectively the conversion framework predicts the target spectral envelopes. Subjective listening tests provided further elements for the evaluation.
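As a rough illustration of what training a conversion function on aligned source-target pairs and scoring it with an objective distance can look like, here is a minimal sketch. It is not the method of the paper: the abstract does not specify the form of the conversion functions or the distance measure, so the per-dimension affine map, the mean Euclidean frame distance, the function names, and the toy feature values below are all assumptions chosen for brevity.

```python
from math import sqrt


def fit_affine_per_dim(source_frames, target_frames):
    """Fit y = a * x + b for each feature dimension by least squares.

    Assumption: frames are already time-aligned, so source_frames[i]
    corresponds to target_frames[i].
    """
    n = len(source_frames)
    dims = len(source_frames[0])
    params = []
    for k in range(dims):
        xs = [frame[k] for frame in source_frames]
        ys = [frame[k] for frame in target_frames]
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        var_x = sum((x - mean_x) ** 2 for x in xs)
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        # Fall back to slope 1 if the source dimension is constant.
        a = cov_xy / var_x if var_x > 0 else 1.0
        b = mean_y - a * mean_x
        params.append((a, b))
    return params


def convert(frame, params):
    """Apply the fitted per-dimension affine map to one feature frame."""
    return [a * x + b for x, (a, b) in zip(frame, params)]


def mean_frame_distance(frames_a, frames_b):
    """Mean Euclidean distance between aligned frames (illustrative measure)."""
    total = 0.0
    for fa, fb in zip(frames_a, frames_b):
        total += sqrt(sum((p - q) ** 2 for p, q in zip(fa, fb)))
    return total / len(frames_a)


# Hypothetical toy spectral features (2 coefficients per frame), not real data.
neutral = [[1.0, 0.5], [2.0, 1.0], [3.0, 1.5], [4.0, 2.5]]
emotional = [[2.5, 0.0], [4.5, 0.5], [6.5, 1.0], [8.5, 2.0]]

params = fit_affine_per_dim(neutral, emotional)
converted = [convert(frame, params) for frame in neutral]

distance_before = mean_frame_distance(neutral, emotional)
distance_after = mean_frame_distance(converted, emotional)
print(distance_before, distance_after)
```

Comparing the distance between source and target frames with the distance between converted and target frames gives a simple indication of whether the conversion moved the spectral features toward the target style.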
Similar resources
New Phase-vocoder Techniques for Pitch-shifting, Harmonizing and Other Exotic Effects
The phase-vocoder is usually presented as a high-quality solution for time-scale modification of signals, pitch-scale modifications usually being implemented as a combination of time-scaling and sampling rate conversion [1]. In this paper, we present two new phase-vocoder-based techniques which allow direct manipulation of the signal in the frequency-domain, enabling such applications as pitch-sh...
Statistical singing voice conversion with direct waveform modification based on the spectrum differential
This paper presents a novel statistical singing voice conversion (SVC) technique with direct waveform modification based on the spectrum differential that can convert voice timbre of a source singer into that of a target singer without using a vocoder to generate converted singing voice waveforms. SVC makes it possible to convert singing voice characteristics of an arbitrary source singer into ...
Speaking Style Conversion from Normal to Lombard Speech Using a Glottal Vocoder and Bayesian GMMs
Speaking style conversion is the technology of converting natural speech signals from one style to another. In this study, we focus on normal-to-Lombard conversion. This can be used, for example, to enhance the intelligibility of speech in noisy environments. We propose a parametric approach that uses a vocoder to extract speech features. These features are mapped using Bayesian GMMs from utter...
A cross-vocoder study of speaker independent synthetic speech detection using phase information
Current speaker verification systems are vulnerable to advanced speech manipulation techniques such as voice conversion and speaker adaptation for TTS systems. Effective anti-spoofing systems that allow the discrimination between human and synthetic impostors have been developed. However, many of them still present two main drawbacks: speaker dependency and, more importantly, counterfeiting tec...
Spectral Correlates in Emotion Labeling of Sustained Musical Instrument Tones
Music is one of the strongest inducers of emotion in humans. Melody, rhythm, and harmony provide the primary triggers, but what about timbre? Do the musical instruments have underlying emotional characters? For example, is the well-known melancholy sound of the English horn due to its timbre or to how composers use it? Though music emotion recognition has received a lot of attention, researcher...